Index Coding with Coded Side-Information 

Namyoon Lee, Alexandros G. Dimakis, and Robert W. Heath Jr. 


Abstract —This letter investigates a new class of index coding 
problems. One sender broadcasts packets to multiple users, 
each desiring a subset, by exploiting prior knowledge of linear 
combinations of packets. We refer to this class of problems 
as index coding with coded side-information. Our aim is to 
characterize the minimum index code length that the sender 
needs to transmit to simultaneously satisfy all user requests. 
We show that the optimal binary vector index code length 
is equal to the minimum rank (minrank) of a matrix whose 
elements consist of the sets of desired packet indices and side- 
information encoding matrices. This is the natural extension 
of matrix minrank in the presence of coded side information. 
Using the derived expression, we propose a greedy randomized 
algorithm to minimize the rank of the derived matrix. 

Index Terms —Index coding and coded side-information 

I. Introduction 

Index coding in 0 0 is a transmission technique for 
a noiseless broadcast channel consisting of a transmitter and 
a set of users. The transmitter wishes to deliver multiple 
packets to their respective users over a shared noiseless link. 
Each user has its own prior knowledge of a subset of the 
packets. The transmitter sends a signal per time slot and all the 
users receive it without noise. The goal is to design transmit 
codes to minimize the number of required transmissions so 
that all users decode the desired packets with their own side- 
information and the received signals from the transmitter. 
This class of problems has recently received attention because 
of its connections to network coding 0 and topological 
interference management 0. Designing an efficient index 
code is closely related to constructing codes for caching 
[7] and distributed storage systems [8]. 

There has been extensive work on characterizing the optimal 
index code length (the minimum number of transmissions) 
0 0. Approaches based on graph theory are popular because 
of the strong connection between the optimal index code 
length and graph-theoretical quantities 0,0. For instance, 
when each user wants distinct packets, an index code design 
problem is equivalently represented in terms of a directed side- 
information graph G. It was shown in [2] that, for a given 
directed side-information graph G, the optimal index code length 
is lower and upper bounded by the maximum independent set 
number of the corresponding graph, $\alpha(G)$, and the chromatic 
number of its complement, $\chi(\bar{G})$, respectively. These approaches 0-i 
are useful in characterizing the bounds of the optimal index 
code length and in obtaining the optimal index code for a 
certain class of side-information graphs (e.g., vertex-coloring 
methods ©). 

Algebraic approaches are also effective for characterizing 
the optimal index code length. One key result is that the 
optimal binary index code length equals the minimum rank 
(minrank) of a matrix that fits the side-information graph G, 
i.e., minrk(G) [2]. This algebraic expression yields a new way 
of constructing index codes by solving a matrix completion 
problem over a finite field. 

In this letter, we consider a generalization of index coding 
in which the side-information packets can themselves be coded. 
Specifically, unlike the conventional assumption that each user 
independently knows a subset of other users’ packets as side- 
information 0-0, © we admit a coded structure in 
generating side-information so that each user is able to have 
any linear combinations of all packets as side-information. As 
is well known, index coding is already tremendously challenging 
even when the side-information packets are not coded. However, 
the idea of allowing coded side-information is useful (beyond 
mathematical interest as a natural generalization) since in 
many situations side-information is created by overhearing 
previous transmissions, which are very frequently coded. 
This is especially relevant in caching scenarios where helpers 
try to assist in content dissemination [7]. 

To explain the index coding problem with coded side- 
information, we provide a motivating example. As depicted 
in Fig. 1, a transmitter desires to deliver a, b, and c to users 
1, 2, and 3, respectively. Let us first consider the uncoded 
side-information case where each user separately knows the 
others' desired information bits. In this case, one XORed 
transmission a + b + c suffices for all three users to decode 
their desired information bits if users 1, 2, and 3 store the two 
bits {b, c}, {a, c}, and {a, b}, respectively, where + represents 
an XOR operation, i.e., addition over the binary field $\mathbb{F}_2$. 
Next, let us consider a different scenario in which each user 
can store only one bit due to limited memory in the device. In 
this case, if users 1, 2, and 3 have the coded side-information 
b + c, a + c, and a + b, respectively, the same XORed 
transmission a + b + c is enough to satisfy all three users; 
user 1, for instance, recovers a = (a + b + c) + (b + c). This 
example reveals the benefit of coding over side-information in 
reducing the cache size while maintaining the transmission 
rate. For the case of uncoded side-information, each user 
requires two bits of cache memory to decode the desired 
information bit from the XORed transmission a + b + c. In 
contrast, if each user stores the XORed bit instead of storing 
the two bits separately, one bit of cache memory is enough to 
extract the desired information bit. 
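
The motivating example can be checked exhaustively. The following is a minimal sketch, assuming the bits a, b, c are modeled as Python integers 0/1 and that + over $\mathbb{F}_2$ is the XOR operator `^`; the function name `decode_all` is illustrative and not part of the letter.

```python
from itertools import product

def decode_all(a, b, c):
    """Return the bits recovered by users 1, 2, 3 from one broadcast."""
    tx = a ^ b ^ c                       # single XORed transmission a + b + c
    coded_cache = (b ^ c, a ^ c, a ^ b)  # one stored (coded) bit per user
    # Each user adds (XORs) its cached bit to the broadcast.
    return tuple(tx ^ u for u in coded_cache)

# Check all 8 assignments of (a, b, c).
all_ok = all(decode_all(a, b, c) == (a, b, c)
             for a, b, c in product((0, 1), repeat=3))
print(all_ok)  # True
```

Each user stores a single bit here, compared with two bits in the uncoded case, while the number of transmissions stays at one.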

Fig. 1. A motivating example of the index coding problem with coded side- 
information. For the case of uncoded side-information, each user requires 
cache memory with two bits to decode the desired information bit from the 
XORed transmission a + b + c by the transmitter. In contrast, if each user 
stores the XORed bit instead of storing the bits separately, cache memory 
with one bit is enough to extract the desired information bit. 

Our main contribution is to characterize the optimal binary 
vector linear index code length in terms of the minrank of 
a matrix when the users have coded side-information. Our 
key finding is that the minrank expression is a function of 1) 
the set of packet indices requested by the users and 2) the 
set of side-information encoding matrices. With the derived 
equivalence between the optimal index code length and the 
minrank expression, we propose a random greedy algorithm 
that minimizes the rank of the derived matrix. The index 
coding problem with coded side-information, in fact, was 
initially considered in © where a linear index code with 
coded side-information can be found equivalently by solving 
a system of multi-variable polynomial equations, which is 
difficult to solve in general. We show how to design index 
codes by minimizing the rank of a derived matrix. This rank 
minimization formulation allows us to connect the index code 
design problem to a matrix completion problem over finite 
fields [10]. 


II. Problem Formulation 

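Consider one transmitter and K users in a network. The 
transmitter has N packets, each with F bits, $\mathbf{x}_n \in \mathbb{F}_2^{F}$ for 
$n \in \mathcal{N} = \{1, 2, \ldots, N\}$. We denote the sequence of all packets 
by $\mathbf{x} = [\mathbf{x}_1^T, \mathbf{x}_2^T, \ldots, \mathbf{x}_N^T]^T \in \mathbb{F}_2^{FN}$. User $k \in \mathcal{K} = \{1, 2, \ldots, K\}$ 
requests a set of packets $\{\mathbf{x}_i\}$ for $i \in \mathcal{T}_k \subset \mathcal{N}$ ($\mathcal{T}_k$ is 
a subset of $\mathcal{N}$). For example, if $\mathcal{T}_k = \{1, 2\}$, then user k 
desires to decode packets $\mathbf{x}_1$ and $\mathbf{x}_2$. Further, user $k \in \mathcal{K}$ has 
coded side-information $\mathbf{u}_k \in \mathbb{F}_2^{M_k F}$ with a size of $M_k F$ 
bits for $M_k F \in \mathbb{N}$. Assuming a linear encoding method, the 
side-information given to user k, $\mathbf{u}_k$, is created by a side- 
information generating matrix $\mathbf{S}_k \in \mathbb{F}_2^{M_k F \times NF}$ as 

$$\mathbf{u}_k = \mathbf{S}_k \mathbf{x}. \tag{1}$$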


With knowledge of the set of encoding matrices $\{\mathbf{S}_1, \ldots, \mathbf{S}_K\}$, 
the transmitter sends different linear combinations of the packets $\mathbf{x}$ 
over LF time slots (channel uses) so that all users successfully 
decode the requested packets by exploiting their coded side- 
information $\mathbf{u}_k$. 

Under the restriction of linear coding, the transmitter uses 
an index coding matrix as an encoding function, i.e., $\mathbf{C}_{\rm IC} = 
[\mathbf{C}_1, \mathbf{C}_2, \ldots, \mathbf{C}_N] \in \mathbb{F}_2^{LF \times NF}$. Note that the k-th sub-matrix 
$\mathbf{C}_k \in \mathbb{F}_2^{LF \times F}$ is the precoding matrix carrying the k-th packet 
$\mathbf{x}_k$. When the transmitter sends L packets with the index 
coding matrix $\mathbf{C}_{\rm IC}$ over a noiseless link, user k obtains the 
information vector $\mathbf{y} \in \mathbb{F}_2^{LF}$ over LF channel uses as 

$$\mathbf{y} = \mathbf{C}_{\rm IC}\, \mathbf{x}. \tag{2}$$


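Applying a linear decoder $\mathbf{D}_k^T \in \mathbb{F}_2^{|\mathcal{T}_k| F \times (L+M_k) F}$, user $k \in 
\{1, 2, \ldots, K\}$ decodes packet $\mathbf{x}_i$ for $i \in \mathcal{T}_k$ using both the 
received signal $\mathbf{y}$ and the coded side-information vector $\mathbf{u}_k$. 
The decodability condition at user k is 

$$\mathbf{D}_k^T \begin{bmatrix} \mathbf{y} \\ \mathbf{u}_k \end{bmatrix} = \mathbf{R}_k \mathbf{x}, \tag{3}$$

where $\mathbf{R}_k \in \mathbb{F}_2^{|\mathcal{T}_k| F \times NF}$ denotes the index matrix of the 
packets requested by user k and $\mathbf{R}_k \neq \mathbf{S}_k$. Hence, the index 
matrix $\mathbf{R}_k$ is a block matrix whose $(i, \ell)$-th sub-block is the 
identity matrix $\mathbf{I}_F$ if $\ell \in \mathcal{T}_k$ is the i-th packet index requested 
by user k; the remaining blocks are zero matrices. With the 
decodability condition in (3), we define a valid linear index 
code and its optimal code length. 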

Definition 1. (Valid linear index code) The index coding 
matrix $\mathbf{C}_{\rm IC} \in \mathbb{F}_2^{LF \times NF}$ is valid over $\mathbb{F}_2$ with length LF 
if every user is able to decode its desired set of packets 
from the transmitted packets and the side-information available 
at that user. In other words, all users simultaneously satisfy the 
decodability conditions in (3). 

Definition 2. (Optimal linear index code length) It is said 
that the index coding matrix $\mathbf{C}_{\rm IC} \in \mathbb{F}_2^{LF \times NF}$ has the optimal 
length $\beta_2^\star$ if $\mathbf{C}_{\rm IC}$ is valid and has the minimum 
number of rows, $\beta_2^\star = \min LF$. 
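
Definition 1 can be tested numerically for small instances. The sketch below rests on one standard linear-algebra fact: a linear decoder for user k exists exactly when every row of $\mathbf{R}_k$ lies in the row space of the stacked matrix $[\mathbf{C}_{\rm IC}; \mathbf{S}_k]$ over $\mathbb{F}_2$. The helper names `gf2_rank` and `is_valid_index_code`, and the representation of matrices as lists of 0/1 rows, are assumptions made for illustration only.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    vecs = [int("".join(str(bit) for bit in row), 2) for row in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank

def is_valid_index_code(C, S_list, R_list):
    """True if every user k can linearly decode R_k x from (C x, S_k x)."""
    for S_k, R_k in zip(S_list, R_list):
        known = C + S_k  # rows observed by user k: broadcast plus side-info
        # R_k x is decodable iff appending the rows of R_k keeps the rank.
        if gf2_rank(known + R_k) != gf2_rank(known):
            return False
    return True

# Motivating example: N = K = 3, F = 1, packets (a, b, c).
S_list = [[[0, 1, 1]], [[1, 0, 1]], [[1, 1, 0]]]  # b+c, a+c, a+b
R_list = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]]  # user k wants packet k
valid_xor = is_valid_index_code([[1, 1, 1]], S_list, R_list)   # send a+b+c
valid_a_only = is_valid_index_code([[1, 0, 0]], S_list, R_list)  # send a
print(valid_xor, valid_a_only)  # True False
```

Sending only a fails because user 2 then observes a and a + c, from which b cannot be formed.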



III. Main Results 

In this section, we characterize the minrank expression of 
the index code length for the class of index coding problems 
with coded side-information. The following theorem is the 
main result of this paper. 


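Theorem 1. For the given set of side-information generating 
matrices $\{\mathbf{S}_1, \ldots, \mathbf{S}_K\}$ and the desired packet index matrices 
$\{\mathbf{R}_1, \ldots, \mathbf{R}_K\}$, the optimal linear vector index code length 
$\beta_2^\star$ over $\mathbb{F}_2$ is obtained by solving the following optimization 
problem: 

$$\beta_2^\star\left(\{\mathbf{S}_k, \mathbf{R}_k\}_{k=1}^{K}\right) = \min_{\mathbf{A}_1^T, \ldots, \mathbf{A}_K^T} \mathrm{rk}\left( \begin{bmatrix} \mathbf{R}_1 + \mathbf{A}_1^T \mathbf{S}_1 \\ \vdots \\ \mathbf{R}_K + \mathbf{A}_K^T \mathbf{S}_K \end{bmatrix} \right), \tag{4}$$

where $\mathbf{A}_k^T \in \mathbb{F}_2^{|\mathcal{T}_k| F \times M_k F}$. 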
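
For very small instances, the optimization in (4) can be solved by exhaustive search, which helps to sanity-check the statement. The sketch below is an illustration and not the algorithm proposed in this letter: it enumerates every binary choice of $\mathbf{A}_k^T$, so its cost grows exponentially with the number of free entries. The function names and the list-of-rows matrix representation are assumptions.

```python
from itertools import product

def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    vecs = [int("".join(str(bit) for bit in row), 2) for row in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank

def stacked_matrix(R_list, S_list, A_list):
    """Stack the rows of R_k + A_k^T S_k over GF(2) for all users k."""
    out = []
    for R_k, S_k, A_k in zip(R_list, S_list, A_list):
        for r_row, a_row in zip(R_k, A_k):   # a_row is one row of A_k^T
            row = list(r_row)
            for coeff, s_row in zip(a_row, S_k):
                if coeff:
                    row = [x ^ y for x, y in zip(row, s_row)]
            out.append(row)
    return out

def brute_force_minrank(R_list, S_list):
    """Exhaustive search over all binary A_k^T (tiny instances only)."""
    shapes = [(len(R_k), len(S_k)) for R_k, S_k in zip(R_list, S_list)]
    n_free = sum(r * c for r, c in shapes)
    best = None
    for bits in product((0, 1), repeat=n_free):
        it = iter(bits)
        A_list = [[[next(it) for _ in range(c)] for _ in range(r)]
                  for r, c in shapes]
        rank = gf2_rank(stacked_matrix(R_list, S_list, A_list))
        if best is None or rank < best[0]:
            best = (rank, A_list)
    return best

# Motivating example of Fig. 1 with coded side-information.
S_list = [[[0, 1, 1]], [[1, 0, 1]], [[1, 1, 0]]]
R_list = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]]
min_rank, A_best = brute_force_minrank(R_list, S_list)
print(min_rank, A_best)
```

For the Fig. 1 instance the search returns rank one with every coefficient equal to one, matching the single transmission a + b + c.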

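Proof: We prove Theorem 1 using an algebraic approach. 
Recall the decodability condition of user $k \in \mathcal{K}$ in (3). We 
decompose the decoding matrix $\mathbf{D}_k^T$ into two sub-matrices 
$\mathbf{B}_k^T \in \mathbb{F}_2^{|\mathcal{T}_k| F \times LF}$ and $\mathbf{A}_k^T \in \mathbb{F}_2^{|\mathcal{T}_k| F \times M_k F}$ as 

$$\mathbf{D}_k^T = \left[ \mathbf{B}_k^T \ \ \mathbf{A}_k^T \right], \tag{5}$$

where the sub-matrices $\mathbf{B}_k^T$ and $\mathbf{A}_k^T$ are multiplied by the received 
signal vector $\mathbf{y}$ and the side-information vector $\mathbf{u}_k$, respectively. 
With these sub-matrices, the decodability condition in (3) at 
user k is equivalently decomposed as 

$$\mathbf{B}_k^T \mathbf{y} + \mathbf{A}_k^T \mathbf{u}_k = \mathbf{R}_k \mathbf{x}. \tag{6}$$

Using the fact that $\mathbf{y} = \mathbf{C}_{\rm IC}\mathbf{x}$ and $\mathbf{u}_k = \mathbf{S}_k \mathbf{x}$, the decodability 
condition in (6) is rewritten as 

$$\left( \mathbf{B}_k^T \mathbf{C}_{\rm IC} + \mathbf{A}_k^T \mathbf{S}_k \right) \mathbf{x} = \mathbf{R}_k \mathbf{x}. \tag{7}$$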
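
Since $\mathbf{x}$ is non-degenerate, the decodability condition in (7) 
simplifies further as 

$$\begin{aligned} \mathbf{B}_k^T \mathbf{C}_{\rm IC} + \mathbf{A}_k^T \mathbf{S}_k &= \mathbf{R}_k, \\ \mathbf{B}_k^T \mathbf{C}_{\rm IC} &= \mathbf{R}_k + \mathbf{A}_k^T \mathbf{S}_k, \end{aligned} \tag{8}$$

where the last equality is due to the addition over $\mathbb{F}_2$. Since 
every user needs to satisfy the decodability condition in (8), 
the decodability condition for all the users is given by 

$$\underbrace{\begin{bmatrix} \mathbf{B}_1^T \\ \vdots \\ \mathbf{B}_K^T \end{bmatrix}}_{\left(\sum_{k=1}^{K} |\mathcal{T}_k|\right) F \times LF} \mathbf{C}_{\rm IC} = \underbrace{\begin{bmatrix} \mathbf{R}_1 + \mathbf{A}_1^T \mathbf{S}_1 \\ \vdots \\ \mathbf{R}_K + \mathbf{A}_K^T \mathbf{S}_K \end{bmatrix}}_{\left(\sum_{k=1}^{K} |\mathcal{T}_k|\right) F \times NF}. \tag{9}$$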


Notice that the rank of each matrix on the left-hand side of (9) 
equals LF. This is because, by definition, $\mathbf{C}_{\rm IC}$ 
should have LF linearly independent rows, as the transmitter 
sends out a linearly independent linear combination of packets 
per time slot. Furthermore, the concatenated matrix of all 
decoding matrices $\mathbf{B}_k^T$ also has LF linearly independent 
columns to employ the received vector $\mathbf{y} \in \mathbb{F}_2^{LF}$ in decoding. 
We denote the concatenated matrix of all decoding matrices 
by $\mathbf{B} = [\mathbf{B}_1, \ldots, \mathbf{B}_K]^T$. From the rank inequality, the rank 
of the product of the two matrices is upper bounded by 

$$\mathrm{rk}(\mathbf{B} \mathbf{C}_{\rm IC}) \leq \min\{\mathrm{rk}(\mathbf{B}), \mathrm{rk}(\mathbf{C}_{\rm IC})\} = LF. \tag{10}$$

Furthermore, applying Sylvester's rank inequality, we obtain 
the lower bound on the rank as 

$$\mathrm{rk}(\mathbf{B} \mathbf{C}_{\rm IC}) \geq \mathrm{rk}(\mathbf{B}) + \mathrm{rk}(\mathbf{C}_{\rm IC}) - LF = LF. \tag{11}$$


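As a result, we conclude that the rank of the matrix on the 
right-hand side of (9) equals LF, namely, 

$$LF = \mathrm{rk}\left( \begin{bmatrix} \mathbf{R}_1 + \mathbf{A}_1^T \mathbf{S}_1 \\ \vdots \\ \mathbf{R}_K + \mathbf{A}_K^T \mathbf{S}_K \end{bmatrix} \right). \tag{12}$$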


Since we are interested in finding the minimum LF, the 
optimal index code length $\beta_2^\star(\{\mathbf{S}_k, \mathbf{R}_k\}_{k=1}^{K}) = \min LF$ 
is obtained by minimizing the rank of the matrix in (12) 
with respect to all possible indeterminate elements in 
$\{\mathbf{A}_k\}_{k=1}^{K}$. Consequently, the minimum index code length is 
obtained by solving the optimization problem stated in (4). ■ 

Theorem 1 shows that the optimal linear index code length 
is determined by two factors: 1) the set of packet index 
matrices $\{\mathbf{R}_k\}$ and 2) the set of side-information encoding 
matrices $\{\mathbf{S}_k\}$. Furthermore, the derived minrank expression 
in (4) is useful for designing the optimal index coding matrix $\mathbf{C}_{\rm IC}$ 
with rank $\beta_2^\star$. This is because, under the premise that 
$\{\mathbf{B}_k\}$ is predefined as $\mathbf{B}_k = \mathbf{I}_{|\mathcal{T}_k| F}$ for $k \in \mathcal{K}$, it is possible to 
attain the optimal index coding matrix with rank $\beta_2^\star$, $\mathbf{C}_{\rm IC}^\star$, by 
arbitrarily selecting a set of $\beta_2^\star$ linearly independent rows 
in (5) with $\{\mathbf{A}_k^\star\}$. Therefore, the index coding matrix can be 
obtained by carefully completing the indeterminate elements in 
$\{\mathbf{A}_k\}$ so that they provide the minimum rank of the resultant 
matrix. This motivates us to design an algorithm that finds the 
index coding matrix via a matrix completion approach, which 
will be explained in Section IV. 


To shed further light on the significance of Theorem 1, it 
is instructive to consider certain special cases and examples. 

A special case is when F = 1, N = K, and $M_k = 1$. 
User $k \in \{1, 2, \ldots, N\}$ requests packet $x_k \in \mathbb{F}_2$ with a one-bit 
file size, i.e., $\mathcal{T}_k = \{k\}$ and $|\mathcal{T}_k| = 1$. Therefore, the packet 
requested by user k is simply written as a unit vector whose 
k-th element is one, $\mathbf{R}_k = \mathbf{e}_k^T$. Further, we assume that the 
memory size of user k is one bit, i.e., $M_k = 1$ for all k. Then, 
the coded side-information generating matrix becomes a vector 
$\mathbf{s}_k^T \in \mathbb{F}_2^{1 \times K}$. In this reduced setup, the optimal index code length 
is stated in the following corollary. 

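Corollary 1. When N = K and $F = M_k = 1$, the optimal 
scalar linear index code length is obtained by solving the 
following optimization problem 

$$\beta_2^\star\left(\{\mathbf{s}_k, \mathbf{e}_k\}_{k=1}^{K}\right) = \min_{\alpha_1, \ldots, \alpha_K} \mathrm{rk}\left( \begin{bmatrix} \mathbf{e}_1^T + \alpha_1 \mathbf{s}_1^T \\ \vdots \\ \mathbf{e}_K^T + \alpha_K \mathbf{s}_K^T \end{bmatrix} \right) = \min_{\mathbf{A}} \mathrm{rk}\left( \mathbf{I}_K + \mathbf{S}\mathbf{A} \right), \tag{13}$$

where $\mathbf{S} = [\mathbf{s}_1, \ldots, \mathbf{s}_K]$ and $\mathbf{A} = \mathrm{diag}[\alpha_1, \alpha_2, \ldots, \alpha_K]^T$. 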

Proof: Without loss of generality, we assume that user k 
desires to decode file $x_k$, i.e., $\mathbf{R}_k = \mathbf{e}_k^T$, with side-information 
$u_k = \mathbf{s}_k^T \mathbf{x}$. Then, from Theorem 1, the optimal index code 
length is obtained by solving the problem stated in (13). ■ 

Example 1 (Optimal Side-Information Encoding Structure): 
For the given $\mathbf{A} = \mathbf{I}_K$, if we choose $\mathbf{S} = \mathbf{J}_K + \mathbf{I}_K$, where 
$\mathbf{J}_K \in \mathbb{F}_2^{K \times K}$ is an all-ones matrix, then the rank of the matrix 
$\mathbf{I}_K + \mathbf{S}\mathbf{A}$ becomes one, as $\mathbf{I}_K + \mathbf{S}\mathbf{A} = \mathbf{I}_K + \mathbf{J}_K + \mathbf{I}_K = \mathbf{J}_K$ 
and $\mathrm{rank}(\mathbf{J}_K) = 1$. As a result, we conclude that the optimal 
index code with length one is achievable if and only if 
the product of the side-information encoding matrix and the 
free variables in $\mathbf{A}$ (decoding matrix) has the particular structure 
$\mathbf{S}\mathbf{A} = \mathbf{J}_K + \mathbf{I}_K$. This confirms the intuition that if each 
user knows the XOR of all packets except 
its desired one as side-information, it is possible to satisfy all 
users by sending the XOR of all packets within one 
channel use. 
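
Example 1 is easy to confirm numerically. The sketch below builds $\mathbf{S} = \mathbf{J}_K + \mathbf{I}_K$ and $\mathbf{A} = \mathbf{I}_K$ for several values of K and evaluates the rank of $\mathbf{I}_K + \mathbf{S}\mathbf{A}$ over $\mathbb{F}_2$; the helper names and the range of K are choices made for illustration.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    vecs = [int("".join(str(bit) for bit in row), 2) for row in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank

def example1_rank(K):
    # S = J_K + I_K: user k stores the XOR of every packet except its own.
    S = [[1 if i != j else 0 for j in range(K)] for i in range(K)]
    # A = I_K, so S A = S and I_K + S A restores the ones on the diagonal.
    M = [[S[i][j] ^ (1 if i == j else 0) for j in range(K)] for i in range(K)]
    return gf2_rank(M)

ranks = [example1_rank(K) for K in range(2, 9)]
print(ranks)  # [1, 1, 1, 1, 1, 1, 1]
```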

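Example 2 (Connection to Index Coding with Uncoded 
Side-Information): Let us consider the following index 
coding problem where K = N = 5 and F = 1. User 
$k \in \{1, \ldots, 5\}$ desires to decode $x_k \in \mathbb{F}_2$ with the set of 
uncoded side-information as follows: 

• User 1 has $x_2$ and $x_5$, i.e., $\mathbf{S}_1 = [\mathbf{e}_2\ \mathbf{e}_5]^T \in \mathbb{F}_2^{2 \times 5}$, 

• User 2 has $x_1$ and $x_5$, i.e., $\mathbf{S}_2 = [\mathbf{e}_1\ \mathbf{e}_5]^T \in \mathbb{F}_2^{2 \times 5}$, 

• User 3 has $x_2$ and $x_4$, i.e., $\mathbf{S}_3 = [\mathbf{e}_2\ \mathbf{e}_4]^T \in \mathbb{F}_2^{2 \times 5}$, 

• User 4 has $x_2$ and $x_3$, i.e., $\mathbf{S}_4 = [\mathbf{e}_2\ \mathbf{e}_3]^T \in \mathbb{F}_2^{2 \times 5}$, 

• User 5 has $x_1$, $x_3$, and $x_4$, i.e., $\mathbf{S}_5 = [\mathbf{e}_1\ \mathbf{e}_3\ \mathbf{e}_4]^T \in \mathbb{F}_2^{3 \times 5}$. 

Note that since the side-information is uncoded, each row 
in the side-information matrix $\mathbf{S}_k$ contains exactly one non-zero element. 
Denoting $\mathbf{A}_k^T = [a_k^1, a_k^2] \in \mathbb{F}_2^{1 \times 2}$ for $k \in \{1, \ldots, 4\}$ and 
$\mathbf{A}_5^T = [a_5^1, a_5^2, a_5^3] \in \mathbb{F}_2^{1 \times 3}$, from Theorem 1, we are able 
to find the optimal index coding matrix $\mathbf{C}_{\rm IC}$ by solving the 
following optimization problem: 

$$\beta_2^{\rm uncoded} = \min_{\{a_k^j\}} \mathrm{rk}\left( \begin{bmatrix} 1 & a_1^1 & 0 & 0 & a_1^2 \\ a_2^1 & 1 & 0 & 0 & a_2^2 \\ 0 & a_3^1 & 1 & a_3^2 & 0 \\ 0 & a_4^1 & a_4^2 & 1 & 0 \\ a_5^1 & 0 & a_5^2 & a_5^3 & 1 \end{bmatrix} \right). \tag{14}$$

Now, we consider the same index coding problem but with 
coded side-information as follows: 

• User 1 has $x_2 + x_5$, i.e., $\mathbf{S}_1 = [\mathbf{e}_2 + \mathbf{e}_5]^T \in \mathbb{F}_2^{1 \times 5}$, 

• User 2 has $x_1 + x_5$, i.e., $\mathbf{S}_2 = [\mathbf{e}_1 + \mathbf{e}_5]^T \in \mathbb{F}_2^{1 \times 5}$, 

• User 3 has $x_2 + x_4$, i.e., $\mathbf{S}_3 = [\mathbf{e}_2 + \mathbf{e}_4]^T \in \mathbb{F}_2^{1 \times 5}$, 

• User 4 has $x_2 + x_3$, i.e., $\mathbf{S}_4 = [\mathbf{e}_2 + \mathbf{e}_3]^T \in \mathbb{F}_2^{1 \times 5}$, 

• User 5 has $x_1 + x_3 + x_4$, i.e., $\mathbf{S}_5 = [\mathbf{e}_1 + \mathbf{e}_3 + \mathbf{e}_4]^T \in \mathbb{F}_2^{1 \times 5}$. 

Since each user has coded side-information, unlike the 
uncoded case, the decoding matrix for user k becomes $\mathbf{A}_k^T = 
[\alpha_k] \in \mathbb{F}_2$. As a result, using Theorem 1, the optimal index 
coding matrix for the coded case is obtained by solving the 
following optimization problem: 

$$\beta_2^{\rm coded} = \min_{\alpha_1, \ldots, \alpha_5} \mathrm{rk}\left( \begin{bmatrix} 1 & \alpha_1 & 0 & 0 & \alpha_1 \\ \alpha_2 & 1 & 0 & 0 & \alpha_2 \\ 0 & \alpha_3 & 1 & \alpha_3 & 0 \\ 0 & \alpha_4 & \alpha_4 & 1 & 0 \\ \alpha_5 & 0 & \alpha_5 & \alpha_5 & 1 \end{bmatrix} \right). \tag{15}$$
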


From (14) and (15), we observe that the two minrank 
optimization problems are equivalent, provided that, in (14), 
additional constraints are imposed on the indeterminate elements 
in each row such that $a_k^j = a_k^i$ for $j \neq i$. Intuitively, 
for the uncoded case, we are able to exploit different side- 
information separately in decoding, which provides more 
degrees of freedom to choose the indeterminate elements. For 
the coded case, however, the side-information is only used in 
the coded fashion in decoding, which imposes the constraints 
on the indeterminate elements. 
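
The relation between the uncoded and coded problems of Example 2 can be explored by brute force, since both search spaces are small ($2^{11}$ and $2^5$ points). The sketch below is a hedged illustration: the 0-based index lists in `known` encode the side-information sets listed in the example, and all function and variable names are assumptions. The coded search is the uncoded search restricted to coefficients that are equal within each row.

```python
from itertools import product

def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    vecs = [int("".join(str(bit) for bit in row), 2) for row in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank

K = 5
# 0-based indices of the packets appearing in each user's side-information.
known = [[1, 4], [0, 4], [1, 3], [1, 2], [0, 2, 3]]
sizes = [len(idx) for idx in known]

def matrix(coeffs):
    """Row k is e_k plus the selected side-information positions of user k."""
    rows = []
    for k, (idx, c) in enumerate(zip(known, coeffs)):
        row = [1 if j == k else 0 for j in range(K)]
        for i, use in zip(idx, c):
            row[i] ^= use
        rows.append(row)
    return rows

# Uncoded: one free coefficient per stored packet (2+2+2+2+3 = 11 bits).
uncoded = min(
    gf2_rank(matrix([bits[sum(sizes[:k]):sum(sizes[:k + 1])] for k in range(K)]))
    for bits in product((0, 1), repeat=sum(sizes))
)
# Coded: a single coefficient alpha_k per user, shared by all its positions.
coded = min(
    gf2_rank(matrix([[a] * n for a, n in zip(alphas, sizes)]))
    for alphas in product((0, 1), repeat=K)
)
print(uncoded, coded)
```

For this particular instance both searches return a minimum rank of two, so storing one coded bit per user costs nothing in transmission length here.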

Example 3 (Coding over Side-Information in a Caching 
Problem): Consider the case of N = K = 2 in which a 
server intends to deliver two files with file size F = 2, i.e., 
$\mathbf{x}_1 = [a_1, a_2]^T$ and $\mathbf{x}_2 = [b_1, b_2]^T$, to users 1 and 2, each 
with one bit of cache memory, $F M_k = 1$. The coded caching 
method proposed in 0 is to store $a_1 + b_1$ at user 1 and 
$a_2 + b_2$ at user 2 in the caching phase. During the delivery 
phase, the transmitter sends $b_1$ and $a_2$ over two channel uses 
to satisfy the users' requests. 

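In our framework, this caching method can be realized by 
choosing the side-information matrices such that 

$$\mathbf{S}_1 = \begin{bmatrix} 1 & 0 & 1 & 0 \end{bmatrix}, \quad \mathbf{S}_2 = \begin{bmatrix} 0 & 1 & 0 & 1 \end{bmatrix}. \tag{16}$$

Since user 1 and user 2 desire to decode files $\mathbf{x}_1$ and $\mathbf{x}_2$, the 
requested packet index matrices are 

$$\mathbf{R}_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \quad \mathbf{R}_2 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \tag{17}$$

By selecting the indeterminate elements as $\mathbf{A}_1^T = [1, 0]^T$ and 
$\mathbf{A}_2^T = [0, 1]^T$, we obtain the minimum index code length of 
two because 

$$\mathrm{rk}\left( \begin{bmatrix} \mathbf{R}_1 + \mathbf{A}_1^T \mathbf{S}_1 \\ \mathbf{R}_2 + \mathbf{A}_2^T \mathbf{S}_2 \end{bmatrix} \right) = \mathrm{rk}\left( \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \right) = 2. \tag{18}$$

As shown in the rows of the resultant matrix in (18), there 
are two possible transmission schemes that satisfy the users' 
requests within two channel uses. The methods are to choose 
the index coding matrix $\mathbf{C}_{\rm IC}$ as either the first two rows in 
(18) or the third and last rows in (18). 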
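
The caching example can be reproduced with a few lines of arithmetic over $\mathbb{F}_2$. In the sketch below, the matrices are written out under the assumption that the packet vector is ordered as $[a_1, a_2, b_1, b_2]$ and that user 1 caches $a_1 + b_1$ while user 2 caches $a_2 + b_2$, as described in the text; the helper names are illustrative.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    vecs = [int("".join(str(bit) for bit in row), 2) for row in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank

# Packet vector x = [a1, a2, b1, b2], i.e., F = 2 bits per file.
S1 = [1, 0, 1, 0]                     # user 1 caches a1 + b1
S2 = [0, 1, 0, 1]                     # user 2 caches a2 + b2
R1 = [[1, 0, 0, 0], [0, 1, 0, 0]]     # user 1 requests x1 = [a1, a2]
R2 = [[0, 0, 1, 0], [0, 0, 0, 1]]     # user 2 requests x2 = [b1, b2]
A1 = [1, 0]                           # entries of the 2 x 1 matrix A_1^T
A2 = [0, 1]                           # entries of the 2 x 1 matrix A_2^T

def block(R, A, S):
    """Rows of R + A^T S over GF(2), with A^T a column and S a single row."""
    return [[r ^ (a & s) for r, s in zip(row, S)] for row, a in zip(R, A)]

M = block(R1, A1, S1) + block(R2, A2, S2)
for row in M:
    print(row)
print(gf2_rank(M))  # 2
```

The rows of `M` select $b_1$, $a_2$, $b_1$, $a_2$ in that order, which is why either the first two rows or the last two rows give a valid two-transmission scheme.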

IV. Algorithm via Matrix Completion 

In this section, we propose an index code design algorithm 
that leverages the minrank expression derived in Theorem 
1. From Theorem 1, we observed that the optimal linear 
index coding matrix can be obtained by solving a matrix 
completion problem over a finite field. It is notable that 
the matrix completion problem of minimizing the rank of the 
resultant matrix in (4) is different from the conventional matrix 
completion problems in [10], [11]. This discrepancy comes 
from the fact that, in our problem, an indeterminate element in 
$\{\mathbf{A}_k\}$ affects multiple entries of the resultant matrix in (4), 
which is not the case for the conventional matrix completion 
problems. 

Using Theorem 1, we propose a greedy random search 
algorithm that finds a linear index coding matrix with 
rank $\beta$. The proposed algorithm initially computes the 
rank of the matrix over $\mathbb{F}_2$, assuming that each $\mathbf{A}_k$ is a matrix of 
zeros. Then, the algorithm runs until no rank decrease is 
detected over U consecutive iterations. In the p-th iteration, 
we first update the elements of $\mathbf{A}_k$ randomly according to a 
Bernoulli distribution with parameter $\tau$; as specified in 
Table I, an element is set to one when a uniform random 
number exceeds $\tau$, i.e., with probability $1 - \tau$. Subsequently, we 
compute the rank of the matrix with the updated matrices 
$\{\mathbf{A}_k^p\}$ and store them if the rank decreases compared to the 
best rank found so far. The algorithm is summarized in Table I. In 
the proposed algorithm, the number of iterations U plays a 
role in balancing performance against complexity. 
On the one hand, when U is chosen 
sufficiently large, the proposed algorithm is able to yield the 
minimum rank with high probability. On the other 
hand, when U is not large enough, the algorithm is less likely 
to reach the minimum rank, in exchange for a lower 
computational complexity. Furthermore, the parameter $\tau$ 
controls the likelihood that the side-information matrix $\{\mathbf{S}_k\}$ is 
used in decoding. This is because the indeterminate elements 
in $\{\mathbf{A}_k\}$ are multiplied by the elements of the side-information 
matrix $\mathbf{S}_k$. 
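
The search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: matrices are lists of 0/1 rows, each entry of $\mathbf{A}_k^T$ is drawn as one with probability $1 - \tau$, the stopping rule counts consecutive iterations without a rank decrease, and the function names, the `seed` argument, and the value U = 200 in the usage lines are choices made here.

```python
import random

def gf2_rank(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    vecs = [int("".join(str(bit) for bit in row), 2) for row in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot
            vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank

def stacked_matrix(R_list, S_list, A_list):
    """Stack the rows of R_k + A_k^T S_k over GF(2) for all users k."""
    out = []
    for R_k, S_k, A_k in zip(R_list, S_list, A_list):
        for r_row, a_row in zip(R_k, A_k):
            row = list(r_row)
            for coeff, s_row in zip(a_row, S_k):
                if coeff:
                    row = [x ^ y for x, y in zip(row, s_row)]
            out.append(row)
    return out

def greedy_random_minrank(R_list, S_list, U, tau, seed=0):
    """Keep the best random completion; stop after U non-improving draws."""
    rng = random.Random(seed)
    shapes = [(len(R_k), len(S_k)) for R_k, S_k in zip(R_list, S_list)]
    best_A = [[[0] * c for _ in range(r)] for r, c in shapes]  # A_k = 0
    beta = gf2_rank(stacked_matrix(R_list, S_list, best_A))
    u = 0
    while u < U:
        # Each entry equals one with probability 1 - tau ("rand > tau").
        A = [[[int(rng.random() > tau) for _ in range(c)] for _ in range(r)]
             for r, c in shapes]
        r_p = gf2_rank(stacked_matrix(R_list, S_list, A))
        if r_p < beta:
            beta, best_A, u = r_p, A, 0   # improvement: reset the counter
        else:
            u += 1
    return beta, best_A

# Coded side-information case of Example 2 (0-based packet indices).
K = 5
known = [[1, 4], [0, 4], [1, 3], [1, 2], [0, 2, 3]]
R_list = [[[int(j == k) for j in range(K)]] for k in range(K)]
S_list = [[[int(j in idx) for j in range(K)]] for idx in known]
beta, A_star = greedy_random_minrank(R_list, S_list, U=200, tau=0.1, seed=0)
print(beta, A_star)
```

With $\tau = 0.1$, a single draw sets all five coefficients to one with probability $0.9^5 \approx 0.59$, so for this instance the search is expected to settle on rank two quickly; larger instances would need a larger U.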

To verify the performance of the proposed algorithm, we 
consider the coded side-information case in Example 2, for 
which the matrix whose rank is minimized is 

$$\begin{bmatrix} 1 & \alpha_1 & 0 & 0 & \alpha_1 \\ \alpha_2 & 1 & 0 & 0 & \alpha_2 \\ 0 & \alpha_3 & 1 & \alpha_3 & 0 \\ 0 & \alpha_4 & \alpha_4 & 1 & 0 \\ \alpha_5 & 0 & \alpha_5 & \alpha_5 & 1 \end{bmatrix}. \tag{19}$$

Note that the minimum rank of the matrix in (19) is two, which 
is obtained when every side-information is used in decoding, 
i.e., $\alpha_k = 1$ for $k \in \{1, \ldots, 5\}$. Applying the proposed random 
greedy search method, we are able to compute the average of 
the rank $\mathbb{E}[\beta]$ as a function of the number of iterations U. As 
illustrated in Fig. 2, when $\tau = 0.1$ (the probability that each 
user does not use coded side-information in decoding), the 
proposed algorithm achieves the minrank of 2 almost surely 
by searching U = 3 points randomly among the $2^5$ points of the 
full search space. Furthermore, once the solutions $\alpha_k^\star = 1$ are determined, 
we obtain the index coding matrix by arbitrarily selecting two 
independent rows (2 and 3) in the matrix (19). As a result, it 
is possible for all users to decode $x_k$ if the transmitter sends 
$x_1 + x_2 + x_5$ and $x_2 + x_3 + x_4$ with two channel uses. 

Fig. 2. Average minrank of the proposed algorithm according to the different 
parameters of U and $\tau$. 

TABLE I 

The proposed greedy randomized index code design algorithm 

Input:          $\{\mathbf{R}_k\}$, $\{\mathbf{S}_k\}$ for $k \in \mathcal{K}$, U, and $\tau$ 
Output:         $\mathbf{C}_{\rm IC}$ 
Initialization: Set $\mathbf{A}_k^\star = \mathbf{0}$ for $k \in \mathcal{K}$ 
                Set p := 0, u := 0, and $\tau$ := t 
                Compute $\beta := \mathrm{rk}\left( \left[ \mathbf{R}_1 + (\mathbf{A}_1^\star)^T \mathbf{S}_1; \ \ldots; \ \mathbf{R}_K + (\mathbf{A}_K^\star)^T \mathbf{S}_K \right] \right)$ 
While u < U 
    1. p := p + 1 
    2. Update $\mathbf{A}_k^p(i, j)$ = (rand > $\tau$) for all entries (i, j) and all $k \in \mathcal{K}$ 
    3. Compute $r_p := \mathrm{rk}\left( \left[ \mathbf{R}_1 + (\mathbf{A}_1^p)^T \mathbf{S}_1; \ \ldots; \ \mathbf{R}_K + (\mathbf{A}_K^p)^T \mathbf{S}_K \right] \right)$ 
    4. Update $\beta := r_p$, $\mathbf{A}_k^\star := \mathbf{A}_k^p$, and u := 0, if $r_p < \beta$ 
       Update u := u + 1, if $r_p \geq \beta$ 
end 
$\mathbf{C}_{\rm IC}$ := Select $\beta$ independent rows in $\left[ \mathbf{R}_1 + (\mathbf{A}_1^\star)^T \mathbf{S}_1; \ \ldots; \ \mathbf{R}_K + (\mathbf{A}_K^\star)^T \mathbf{S}_K \right]$ 

V. Conclusion 

In this letter, we studied a class of index coding problems 
with coded side-information. The optimal binary linear index 
code length is characterized in terms of the minrank expression 
of a matrix using an algebraic approach. By leveraging the 
derived minrank expression, we proposed a simple algorithm 
that solves a matrix completion problem to design index codes. 
The analytical minrank expression derived in this letter can 
be applied to design caching algorithms in many content 
distribution systems. 



























